Predicting Iran’s achievement to Sustainable Development Goal 3.2: A systematic analysis of neonatal mortality with scenario-based projections to 2030

Background Sustainable Development Goal 3.2 (SDG 3.2) is to reduce Under-5 and neonatal mortality rates (U5MR and NMR), two major health systems’ performance indicators, globally by 2030. We aimed to report Iran’s U5MR and NMR status during 2010–2017 and its achievement of SDG 3.2 by 2030, using scenario-based projection. Study design To estimate the national and subnational levels of U5MR and NMR, we applied an Ensemble Bayesian Model Averaging (EBMA) with Gaussian Process Regression (GPR) and Spatio_temporal models. We used all available data sources including: 12-year data from the Death Registration System (DRS), two censuses, and a demographic and health surveys (DHS). This study employed two approaches, Maternal Age Cohort (MAC) and Maternal Age Period (MAP), to analyze summary birth history data obtained from censuses and DHS. In addition, we calculated the child mortality rate directly from DHS using the complete birth history method. National and subnational NMR was projected up to 2030 with a scenario-based method using average Annual Rate of Reduction (ARR) introduced by UN-IGME. Results In 2017, national U5MR and NMR were 15·2 (12·4–18·0) and 11·8 (10·4–13·2), with an average ARR of 5·1% (2·1–8·9) and 3·1% (0·9–5·8) during 2010–2017, respectively. According to our projection scenarios, 17 provinces have not fulfilled SDG 3.2 for NMR yet, and the current trend (the current trend of NMR improvement in Iran) will not result in reaching SDG for some provinces by 2030; However, if each province has the same neonatal mortality annual reduction rate as the best-performing province in the same region, besides achieving SDG, the national NMR will be reduced to 5·2, and almost 92,000 newborn lives will be saved. Conclusions Iran has achieved SDG3.2 regarding U5MR and NMR; however, there are provincial inequalities. For all provinces to reach SDG3.2, health policies should focus on reducing provincial inequalities by precise planning for neonatal health care.


Introduction
We used different data sources and reliable methods to estimate neonatal mortality and child mortality rates in Iran at the national and sub-national levels from the NASBOD project. (1) All available data sources were identified, and comprehensive demographic methods were applied to estimate mortality rates. Due to incompleteness and misclassification of death registration in Iran, this study used information on deaths from all related data sources to analyze mortality data. We employed the Bayesian Model Averaging (BMA) statistical method, with an ensemble of Gaussian Process Regression (GPR) and Spatio-Temporal (ST) models to estimate mortality rates. GPR can synthesize data from different sources and provide unified time trends and 95% uncertainty intervals for mortality rates at the national and sub-national levels. GPR is usually defined by its mean and covariance function. For this analysis, we considered the Generalized linear mixed model (GLMM) adjusted for spatial and temporal correlations as the prior for mean function and Matern as the covariance function. After defining non-informative prior for hyperparameters and incorporating empirical data, Markov Chain Monte Carlo (MCMC) method was used to draw samples from posterior distributions of model parameters and predicted mortality rates. GPR also considers both sampling and non-sampling errors in the final uncertainty intervals.
ST model contained three components, spatial, temporal, and spatiotemporal interaction. We used the Besag-York-Mollie (BYM) correlation structure, a conditional autoregressive (CAR) model, to consider the spatial dependency in mortality rates. Non-informative priors were defined for model hyper-parameters, and samples were drawn using the integrated nested Laplace approximations (INLA) method.

Identifying the data sources for neonate and child mortality
There were three data sources for estimating the under-five mortality rate in Iran: Death Registration System (DRS), censuses, and Demographic and Health Surveys (DHS). We also used the 12-year (2006-2017) U5MR of DRS. Direct and indirect methods were incorporated into censuses and DHSs. These data sources are explained in detail below. We also used information on neonatal deaths from DRS for 8-year (2010-2017) to estimate NMRs.

Census
Although there have been many population censuses in Iran in the last decades, the first formal and comprehensive census was performed in 1956. So far, eight censuses have been conducted in Iran. Since 1986, Summary Birth History (SBH) questions have been included in census questionnaires to estimate child mortality indirectly. We used the censuses of 1996, 2006, 2011, and 2016 as indirect data sources to estimate U5MR.

Iranian Demographic and Health Survey (IrDHS)
DHS is one of the primary data sources for child mortality estimation. DHS collects comparable population-based data on fertility, contraception, maternal and child health, and nutrition in developing countries and is funded by the United States Agency for International Development (USAID). One of the sections in DHS questionnaires is related to the child's health. This section includes questions of SBH and CBH. (2)

Analysis of SBH data
In this stage, we used the methodology developed by IHME for analyzing SBH data. In this method, SBH data are analyzed by two different approaches: cohort and period. (4)

Maternal Age Cohort (MAC)
Maternal Age Cohort (MAC) method is a modification of WHO version in two ways. (5) One is that it directly estimates the under-five mortality rate, and the second is that the country's random effect is incorporated into the model. This method is based on two regression equations. In this method, under-five mortality and reference time equations are estimated using summary birth history data with the mother's age as a proxy for children. Then, by calculating the antilog of the logit of the mortality rate, it is converted into the under-five mortality rate.

Component of reference time is:
The logit component of under-five mortality rate is: For each age group of mothers, one U5MR and one reference time are calculated.

Maternal Age Period (MAP)
In the second approach, Maternal Age Period (MAP) method finds the distribution of child birthdays and death days for different categories of mothers, stratified by region, age, and the number of children ever born/dead children. The expected numbers of children ever born (CEB) and children dead (CD) were calculated using these distributions for every year before the survey.
In the next stage, we added the numbers of CEB and CD for all strata and every year. Finally, we calculated the ratio of CD to CEB and found the under-five mortality. This method has one regression equation. Below is the equation of the MAP method: In this method, the subscripts are: For every year before the survey, one U5MR is obtained. (4)

Combining two indirect estimates using LOESS regression
We smoothed and combined two estimates obtained from MAC and MAP by using Loess.
Loess has the advantage of not limiting the estimates to a linear trend over time. Here, we excluded the U5MR obtained from the 15-19 years old mothers because MAC estimates of this age group are overestimated. (4)

Analysis of complete birth history data
At first, to analyze CMR using CBH, we converted the date of birth, date of death, and age of death of participants into Century Month Code (CMC) format.

Analysis of Death Registration data
A common problem related to analyzing DRS data is missing values of age and sex. To estimate U5MRs, subjects with missing age values will drop out of the numerator of calculated rates. We have used Amelia II package (R 3.5.3) to impute missing values of age and sex variables. Amelia implements a bootstrapping-based algorithm to perform multiple imputations, and it includes useful diagnostics for imputed values. (7) Besides the problem of missing values, the death registration system's incompleteness leads to underestimating mortality rates. Therefore, we should adjust for the degree of incompleteness in the analysis. We estimated DRS's completeness using a model that compares the level of U5MRs from DRS with the other data sources. (8) In the first step, we apply a sub-national level regression of the logarithm of mortality rates on year and an indicator variable adjusting for the incompleteness in the level of DRS mortality rates: Where t is time (year), I is indicator variable for mortality rates that are obtained from DRS and is the error term. The magnitude of the coefficient for this indicator shows the completeness of DRS data. If the coefficient of the indicator variable is statistically significant at level 0.05, we conclude that DRS has incompleteness for that province. As is shown in the equation, this model allows completeness to vary over time. Completeness and its standard error estimated in this stage will be used in GPR model, which is explained in the following section.

Gaussian Process Regression Model
Gaussian Process is a generalization of a Gaussian (normal) probability distribution, which extends any finite linear combination of random variables to functions. (9) Like multivariate normal distribution, GP has a mean and covariance with the difference that these mean and covariance can be functions. A GP is entirely described by its mean and covariance function. These two functions are determined by data information and properties. For this reason, the Bayesian framework is applied for improving the model to conclude accurate and more reliable estimates. Since our target values are mortality rates over time, we used Gaussian Process Regression to draw an inference based on these values. (10) We intend to combine time series of different data sources for a unique estimate of U5MRs with a 95% uncertainty interval.
We considered the logarithm of under-five mortality rates to have a normal distribution. The mean is a function of year and an indicator variable that shows whether mortality rates are from DRS.
The variance of mortality rates includes both sampling and non-sampling errors described in the following parts. Since in some provinces DRS undercount child deaths, the bias term ( DRS  ) is added to mortality rates to adjust for incompleteness. If the indicator coefficient for a province was significant, we set a normal prior for the incompleteness coefficients with the mean equal to the Mixed Models (GLMM) adjusted for spatial and temporal correlations, and covariance function C is a Matern function described in the following sections.

Mean function
For the mean function, we used a GLM model adjusted for correlation between neighboring provinces and nearby time points. This model uses this information to predict for province-years where no data or sparse data is available. We have used the additional information of covariates (including wealth index, years of schooling, and urbanization) 1 to make this prediction more accurate. The model specification is defined in the following equation: (5 0) = 0 + 1 + 2 + 3 + 0 + 1 + Where: 1. WI, YOS, and Urban is abbreviations for the mean wealth index, mean years of schooling, and percent of urbanization in each province-year.
2. 0 is the random province effect and allows each province to have had the mean of CMR which, was different from other provinces. 0 has a Normal (0, 0 2 ) prior.
1. 1 term allows changes in log (5 0) over time in each province to be different from other provinces. 1 has a Normal (0, 1 2 ) prior.
Then, to calculate changes of dependent variable across time and space, we applied an additional regression analysis. In this order, we estimated the residual (predicted -observed dependent variable) of GLMM for each data point and then ran a local regression in two dimensions on the residual. Considering that the residuals contain valuable information that varies systematically across place and time but cannot be directly observed, this process allows us to predict how much the observed dependent variable varies from the mixed-effects model's prediction and account for these differences. Local regressions were done by cycling through every observation in the dataset, weighting each observation in the dataset relative to it, and finding the residual s's weighted mean.
We estimated adjacency matrices over province and year. Then by using a weighting scheme similar to the tricubic weights used in traditional LOESS local regression, we weighted all observations j relative to observation i in time: Where = 2 is a parameter that can be adopted to raise or decline the level of smoothing across time. We calculated the weight matrix for adjacent provinces whose elements for provinces that are neighbors equal 1 and 0 otherwise. Once these weights have been calculated, weighting every observation in the data set relative to the one being predicted, it a simple matter of calculating a weighted average of the residuals from regression. This "predicted residual" is then added back onto the prediction, creating an estimate that more closely considers aspects of the data that cannot be captured by a simple covariate model. (11)

Covariance function
The covariance function in GP determines how each data point can borrow information from neighboring points. Finding a suitable covariance function of GPR is a critical part of the analysis because it adjusts data variation by its parameters. Among the various covariance functions for GP, Matern function was chosen because it is a suitable one for time series data, and its parameters entirely define it. Matern defined for two points separated by a distance d as: The calculated values from C (d) are placed in a symmetric matrix of prediction years. This function includes three parameters: one is called "amplitude," which calculates the deviation between under-five mortality rates and GP mean function over time; another is the "scale" parameter, which captures the correlation between years. This parameter is defined by uniform distribution and updated by Bayesian inferences. The third parameter, "degree of differentiability," controls samples' smoothness from the Gaussian Process. Mortality rates obtained from DHS and censuses have a disagreement in reporting values and need to be made smoother. We assume less degree of smoothness for death registration observations and a greater degree of smoothness for observation from other mortality sources. We set the same values that IHME used to maintain comparability.

Uncertainty with Sampling and Non-sampling Variances
According to existing different data sources for estimating child mortality rates and considering sampling variance, we are interested in estimating the disagreement between data sources in terms of uncertainty. A critical output from GPR is an estimate of both the expected value of mortality rates and their uncertainty. The uncertainty estimate captures both uncertainties in prior mean function and the data variance from each observation. Data variance is a function of both sampling and non-sampling error.
We calculate sampling variance of rates assuming a binomial distribution, with sample size equal to the number of children between birth and exact age five depending on the data source and probability equal to the under-five mortality rate. We choose the size of the birth cohort for DRS data points. The variance that is obtained by using this distribution is the sampling error mortality rate. According to delta method, the variance of this distribution is given by ( 10 (5 0)) = , 2 = 1 (5 0 , log (10)) 2 5 0 , (1 − 5 0 , ) Non-sampling variance captures the variation between data sources. Unlike the sampling variance, we specify a prior distribution for it. The more variation in data sources, the higher the variance parameter and, therefore, the more uncertain the predictions. Registration System. We excluded the variance of DRS' incompleteness in high-quality provinces.
The MAD is median absolute deviation, which has a good performance in the presence of outliers.
Gamma distribution is suitable for estimating variance in Bayesian inferences. The sum of these two variances (sampling and non-sampling) result uncertainty interval.

Markov Chain Monte Carlo
We used Bayesian inference to update predicted mortality rates as a posterior in Bayes rule by combining data and prior probability distribution over parameters in mean, covariance function, and the regression model. These predicted series are updated by the data within each province using GPR model. We employed Markov Chain Monte Carlo (MCMC) to sample from posterior probability distribution, describing the probability of a true model given the under-five mortality rates that we observe in a province. Except in special circumstances, the posterior distribution for model parameters cannot be calculated analytically and must be approximated. Using MCMC, we draw samples from the model's posterior distribution to estimate the median and 95% uncertainty intervals of the mortality rate across provinces over the years. We draw 12000 samples for every province after discarding the first 6000 burn-in samples. Across all samples for a given provinceyear combination, we used median values, 2·5 and 97·5 percentiles of each outcome as the point estimate and upper and lower confidence intervals.
To do MCMC calculations, we used RStan package version 2.19.2 in R software. RStan can employ complex new statistical methods (such as Hamiltonian Monte Carlo) in MCMC. It has a similar syntax to BUGS language for Bayesian analysis and compiles C++ codes based on userwritten scripts. By using RStan package, we can apply complex models to the data. It is also an open-source package that is easily accessible through the internet. We employed GPR method, which was developed by GBD study and all the parameters were the same compared to the methodology for under-five mortality in GBD 2015.

Spatio-Temporal Model
In this study, we employed the spatiotemporal model in a Bayesian framework to estimate CMRs in all provinces of Iran. To take the spatial dependency into account, we use Besag-York-Mollie (BYM) correlation structure under conditional autoregressive (CAR) models. We describe the model as follows: (5 0) = 0 + 1 + 2 + 3 + + + + + + Where: 1. is the spatial unstructured random effect in province i and has a Normal (0, 2 ) prior; 2. is the spatial structured random effect in province i and has a conditional autoregressive prior as follows; is interaction term between spatially and temporally structured random effects. We assume the distribution of the interaction effect as: Where and are the adjacency and random walk (RW2) dependency matrices, respectively.
We have used non-informative priors for model hyper-parameters such as variance components and fitted this model using RINLA package version 20.03.17 from R software. (12)

Model ensemble
As described in the main text, we developed an ensemble of GPR and ST models to estimate CMRs. We applied a BMA method to combine these two models and generate final estimates of Where N is the number of models and is a parameter influencing the relative weighting of models. We set = 1.2 based on previous studies. (11)

Misaligned areal unit in spatio-temporal modeling
Increasing the number of provinces during this period (24 provinces in 1990 versus 31 provinces in 2017) was a challenging issue in modeling mortality data. This increase leads to changes in the neighborhood structure of provinces in the Spatio-temporal model. Nonetheless, many commonlyused Spatio-temporal models work with a unique neighborhood matrix throughout the years. One solution for this problem is to aggregate the newly added provinces based on the country's old administrative divisions. This solution is usually known as backward alignment. Since we had access to district codes for many data sources, the misalignment problem can be solved for most provinces. We extracted the district codes based on the Statistical Center of Iran's administrative division in 2011 and then attempted to form the new provinces by aggregating related districts in the preceding years. Unfortunately, for some data sources, including DHS 2000 and the census 1996, the data had been collected only at the provincial level, and there was no information available at the district level. For these data sources, we have used a statistical model that uses U5MRs of old provinces to predict U5MRs for newly added provinces. Imagine province A is from all sources together in a data set and fitted the following model: Where is the fixed effects of provinces (that could be A, B, and C) and is the fixed effects of data sources (three SBH and one CBH). Using this model, we can predict the mortality rates for misaligned provinces of DHS 2000 and the census 1996 according to the recent administrative division of the country.

Statistical analysis for estimating NMR
The NMR was modeled as the product of two components: the expected NMR and a province- year-specific multiplier. The expected NMR provides the expected association between the NMR and the current level of U5MR in a particular province and year. We searched for parametric and non-parametric patterns of the global association between the expected NMR and U5MR, and we found that a quadratic function well captures the association between the log (NMR) and log (U5MR) (figure1). Previous studies investigated the relation between the ratio (log (NMR)/log (U5MR)log (NMR)) and estimated U5MR at the global level. (14) We compared the relation between ratio and estimated U5MR in our data with other studies and found a similar relation. (figure2).
The second component, the province-year-specific multiplier, allows provinces to have NMRs that are higher or lower than expected given the current U5MR, compared with the global association.
The province-specific multipliers were modeled with GPR and ST models. We considered the quadratic form of CMR estimates as explanatory variables in GPR's mean function and ST model to estimate NMR.
Figure2: The left plot shows the relation between the observed log(ratio) and predicted log(U5MR) at the global level (14), and the right plot shows this relation between Iran's mortality data.